D Partition Algorithms – A Study and Emergence of Mining Projected Clusters in High-Dimensional Dataset
Authors
Abstract
High-dimensional data poses a major challenge because of the inherent sparsity of the points. Existing clustering algorithms are inefficient when the required similarity measure is computed between data points in the full-dimensional space. In this work, a number of projected clustering algorithms are analyzed; most of them encounter difficulties when clusters hide in subspaces of very low dimensionality. These challenges motivate a reliable K-medoids partitional, distance-based projected clustering algorithm. The proposed process is based on the K-Means and K-medoids algorithms, with the computation of distance restricted to subsets of attributes where object values are dense. The K-medoids algorithm is capable of detecting projected clusters of low dimensionality embedded in a high-dimensional space and avoids computing distances in the full-dimensional space. This article analyzes the performance of K-medoids.
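The abstract gives no pseudocode, so the following is only a minimal sketch of the general idea of restricting distance computation to a per-cluster subset of attributes. The function names, the range-based density test in `dense_dims`, the `max_spread` threshold, and the toy data are all assumptions made for illustration; they are not taken from the paper.

```python
from math import sqrt


def projected_distance(x, y, dims):
    """Euclidean distance over the attributes in dims only, normalized by
    the number of attributes so subspaces of different sizes are comparable."""
    return sqrt(sum((x[d] - y[d]) ** 2 for d in dims) / len(dims))


def dense_dims(members, max_spread):
    """Hypothetical density test (an assumption, not the paper's criterion):
    keep attributes whose value range inside the cluster is <= max_spread."""
    n_dims = len(members[0])
    dims = [
        d for d in range(n_dims)
        if max(p[d] for p in members) - min(p[d] for p in members) <= max_spread
    ]
    return dims or list(range(n_dims))  # fall back to all attributes


def assign(points, medoids, medoid_dims):
    """Assign each point to the medoid with the smallest projected distance,
    measured in that medoid's own attribute subset."""
    labels = []
    for p in points:
        dists = [projected_distance(p, m, dims)
                 for m, dims in zip(medoids, medoid_dims)]
        labels.append(dists.index(min(dists)))
    return labels


def update_medoid(members, dims):
    """Pick the member with the smallest total projected distance to the rest."""
    return min(members,
               key=lambda c: sum(projected_distance(c, p, dims) for p in members))


# Toy data: the first three points agree on attribute 0, the last three on
# attribute 1; the remaining attributes are noise.
points = [
    (1.0, 2.0, 9.0), (1.1, 8.0, 1.0), (0.9, 5.0, 5.0),
    (7.0, 9.0, 3.0), (2.0, 9.1, 8.0), (5.0, 8.9, 0.5),
]
medoids = [points[2], points[3]]
medoid_dims = [[0], [1]]  # assumed relevant attributes per medoid
labels = assign(points, medoids, medoid_dims)
print(labels)
```

In a full K-medoids loop, `assign`, `dense_dims`, and `update_medoid` would be repeated until the medoids stop changing. The distance is never evaluated over all attributes unless the density test finds none.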
Similar resources
High-Dimensional Unsupervised Active Learning Method
In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...
Clustering high dimensional data using subspace and projected clustering algorithms
Problem statement: Clustering has a number of techniques that have been developed in statistics, pattern recognition, data mining, and other fields. Subspace clustering enumerates clusters of objects in all subspaces of a dataset. It tends to produce many overlapping clusters. Approach: Subspace clustering and projected clustering are research areas for clustering in high dimensional spaces. I...
Review Paper on Clustering Techniques
The purpose of data mining is to extract information from a large data set and transform it into an understandable form for further use. Clustering is a significant task in data analysis and data mining applications. It is the task of arranging a set of objects so that objects in the same group are more related to each other than to those in other groups (clusters). Data ...
Frequent-Pattern based Iterative Projected Clustering
Irrelevant attributes add noise to high dimensional clusters and make traditional clustering techniques inappropriate. Projected clustering algorithms have been proposed to find the clusters in hidden subspaces. We realize the analogy between mining frequent itemsets and discovering the relevant subspace for a given cluster. We propose a methodology for finding projected clusters by mining freq...
A Comprehensive Study of Several Meta-Heuristic Algorithms for Open-Pit Mine Production Scheduling Problem Considering Grade Uncertainty
It is important to find a global optimum in large-scale problems in order to improve the quality of decision-making in mining operations. It has been widely confirmed that the long-term production scheduling (LTPS) problem plays a major role in mining projects to develop the performance regarding the obtainability of constraints, while maximizing the whole...